Data Conversions
========================

Modeling your data and getting it in the right shape may require various conversions, depending your starting point and/or your goal. In ``bnlearn`` various functionalities are readily implemented to make conversions *from* or *to* the **adjacency matrix** or **vectors**.

Available functionalities:

	* adjmat2dict : Convert adjacency matrix to dictionary.
	* adjmat2vec : Convert adjacency matrix into vector with source and target.
	* vec2adjmat : Convert source-target edges with its weights into an adjacency matrix.
	* dag2adjmat : Convert model into adjacency matrix.
	* vec2df : Convert source-target edges into sparse dataframe.


Adjacency matrix
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The adjacency matrix is used to store relationships across source-target variables (nodes) with its edges.
In graph theory, a square matrix is used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph. 
``bnlearn`` outputs an adjacency matrix in some functionalities. Values 0 or False indicate that nodes are not connected whereas pairs of vertices with value >0 or True are connected.


**Importing a DAG**

Extracting adjacency matrix from imported DAG:

.. code-block:: python
   
   # Import library
   import bnlearn as bn
   # Import DAG
   model = bn.import_DAG('sachs')
   # Show the retrieved adjacency matrix for Sachs:
   model['adjmat']

   # print
   print(model['adjmat'])

Reading the table from left to right, we see that gene Erk is connected to Akt in a directed manner. 
This indicates that Erk influences gene Ark but not the otherway arround because gene Akt does not show a edge with Erk. In this example form, there may be a connection at the "...".

.. table::

  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |      |  Erk|   Akt|   PKA|   Mek|   Jnk| ... |  Raf|   P38|  PIP3|  PIP2|  Plcg|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |Erk   |False| True | False| False| False| ... |False| False| False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |Akt   |False| False| False| False| False| ... |False| False| False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |PKA   |True | True | False| True | True | ... |True | True | False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |Mek   |True | False| False| False| False| ... |False| False| False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |Jnk   |False| False| False| False| False| ... |False| False| False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |PKC   |False| False| True | True | True | ... |True | True | False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |Raf   |False| False| False| True | False| ... |False| False| False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |P38   |False| False| False| False| False| ... |False| False| False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |PIP3  |False| False| False| False| False| ... |False| False| False| True | False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |PIP2  |False| False| False| False| False| ... |False| False| False| False| False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+
  |Plcg  |False| False| False| False| False| ... |False| False| True | True | False|
  +------+-----+------+------+------+------+-----+-----+------+------+------+------+

Vector
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The **vector** is used to store relationships based on source-target variables (nodes), and with its weigths.
An example is illustrated below for which edges are defined when weights are True or a number >=1.

.. table::

   +------------+------------+---------+
   |  source    |  target    | weight  |
   +------------+------------+---------+
   |  Cloudy    | Sprinkler  | True    |
   +------------+------------+---------+
   |  Cloudy    | Rain       | True    |
   +------------+------------+---------+
   |  Sprinkler | Wet_Grass  | True    |
   +------------+------------+---------+
   |  Rain      | Wet_Grass  | True    |
   +------------+------------+---------+


adjmat2vec
^^^^^^^^^^^^
Converting an adjacency matrix into vector with :func:`bnlearn.bnlearn.adjmat2vec`

.. code-block:: python
   
   import bnlearn as bn
   # Load DAG
   DAG = bn.import_DAG('Sprinkler')
   # Convert adjmat to vector:
   vector = bn.adjmat2vec(DAG['adjmat'])

.. table::

   +------------+------------+---------+
   |  source    |  target    | weight  |
   +------------+------------+---------+
   |  Cloudy    | Sprinkler  | True    |
   +------------+------------+---------+
   |  Cloudy    | Rain       | True    |
   +------------+------------+---------+
   |  Sprinkler | Wet_Grass  | True    |
   +------------+------------+---------+
   |  Rain      | Wet_Grass  | True    |
   +------------+------------+---------+


vec2adjmat
^^^^^^^^^^^^
Converting the created vector in the example above back into an adjacency matrix with :func:`bnlearn.bnlearn.vec2adjmat`


.. code-block:: python
   
   import bnlearn as bn
   # Convert vector back to adjmat.
   adjmat = bn.vec2adjmat(vector['source'], vector['target'], weights=vector['weight'])

.. table::

	+-----------+--------+-------------+-------------+----------+
	| source    |   Rain |   Sprinkler |   Wet_Grass |   Cloudy |
	+===========+========+=============+=============+==========+
	| Rain      |      0 |           0 |           1 |        0 |
	+-----------+--------+-------------+-------------+----------+
	| Sprinkler |      0 |           0 |           1 |        0 |
	+-----------+--------+-------------+-------------+----------+
	| Wet_Grass |      0 |           0 |           0 |        0 |
	+-----------+--------+-------------+-------------+----------+
	| Cloudy    |      1 |           1 |           0 |        0 |
	+-----------+--------+-------------+-------------+----------+


adjmat2dict
^^^^^^^^^^^^
Convert adjacency matrix to dictionary with :func:`bnlearn.bnlearn.adjmat2dict`


.. code-block:: python
   
   # Import library
   import bnlearn as bn
   # Load DAG
   DAG = bn.import_DAG('Sprinkler')
   # Convert adjmat to vector:
   adjmat_dict = bn.adjmat2dict(DAG['adjmat'])
   # print
   print(adjmat_dict)

   # {'Cloudy': ['Sprinkler', 'Rain'],
   #  'Sprinkler': ['Wet_Grass'],
   #  'Rain': ['Wet_Grass'],
   #  'Wet_Grass': []}


dag2adjmat
^^^^^^^^^^^^
Convert model into adjacency matrix with :func:`bnlearn.bnlearn.dag2adjmat`

.. code-block:: python
   
   # Import library
   import bnlearn as bn
   # Load DAG
   DAG = bn.import_DAG('Sprinkler')
   # Extract edges from model and store in adjacency matrix
   adjmat=bn.dag2adjmat(DAG['model'])

.. table::

	+-----------+--------+-------------+-------------+----------+
	| source    |   Rain |   Sprinkler |   Wet_Grass |   Cloudy |
	+===========+========+=============+=============+==========+
	| Rain      |      0 |           0 |           1 |        0 |
	+-----------+--------+-------------+-------------+----------+
	| Sprinkler |      0 |           0 |           1 |        0 |
	+-----------+--------+-------------+-------------+----------+
	| Wet_Grass |      0 |           0 |           0 |        0 |
	+-----------+--------+-------------+-------------+----------+
	| Cloudy    |      1 |           1 |           0 |        0 |
	+-----------+--------+-------------+-------------+----------+

vec2df
^^^^^^^^^^^^
Convert edges between source and taget into a dataframe based on the weight with :func:`bnlearn.bnlearn.vec2df`
For demonstration purposes, A small example is created below for which can be seen that the weights are indicative for the number of rows; a weight of 2 will result that a row with the edge is created 2 times.

.. code-block:: python
   
   # Import library
   import bnlearn as bn
   # Create source-target edges with its weights
   source=['Cloudy','Cloudy','Sprinkler','Rain']
   target=['Sprinkler','Rain','Wet_Grass','Wet_Grass']
   weights=[1,2,1,3]
   # Convert into sparse dataframe.
   df = bn.vec2df(source, target, weights=weights)

.. table::

	+----+----------+--------+-------------+-------------+
	|    |   Cloudy |   Rain |   Sprinkler |   Wet_Grass |
	+====+==========+========+=============+=============+
	|  0 |        1 |      0 |           1 |           0 |
	+----+----------+--------+-------------+-------------+
	|  1 |        1 |      1 |           0 |           0 |
	+----+----------+--------+-------------+-------------+
	|  2 |        1 |      1 |           0 |           0 |
	+----+----------+--------+-------------+-------------+
	|  3 |        0 |      0 |           1 |           1 |
	+----+----------+--------+-------------+-------------+
	|  4 |        0 |      1 |           0 |           1 |
	+----+----------+--------+-------------+-------------+
	|  5 |        0 |      1 |           0 |           1 |
	+----+----------+--------+-------------+-------------+
	|  6 |        0 |      1 |           0 |           1 |
	+----+----------+--------+-------------+-------------+


To demonstrate the full functionality A larger example can be loaded containing 352 edges from the book A Storm of Swords.
The results is that 107 unique names are extracted with 4324 edges. This dataframe can for example be an input for structure learning approaches.

.. code-block:: python
   
   # Import library
   import bnlearn as bn
   # Load large example with source-target edges from the book A Storm of Swords 
   vec = bn.import_example("stormofswords")
   # Convert into sparse dataframe.
   df = bn.vec2df(vec['source'], vec['target'], weights=vec['weight'])
   # sparse matrix:
   print(df.shape)
   # (4324, 107)


.. include:: add_bottom.add